Building Large Machine Reading-Comprehension Datasets using Paragraph Vectors

Authors

  • Radu Soricut
  • Nan Ding
Abstract

We present a dual contribution to the task of machine reading comprehension: a technique for creating large machine-comprehension (MC) datasets using paragraph-vector models, and a novel hybrid neural-network architecture that combines the representation power of recurrent neural networks with the discriminative power of fully connected multi-layered networks. We use the MC-dataset generation technique to build a dataset of around 2 million examples, for which we empirically determine the high ceiling of human performance (around 91% accuracy), as well as the performance of a variety of computer models. Among all the models we have experimented with, our hybrid neural-network architecture achieves the highest performance (83.2% accuracy). The remaining gap to the human-performance ceiling provides enough room for future model improvements.

Similar resources

Dataset for the First Evaluation on Chinese Machine Reading Comprehension

Machine Reading Comprehension (MRC) has become enormously popular recently and has attracted a lot of attention. However, existing reading comprehension datasets are mostly in English. To add diversity to reading comprehension datasets, in this paper we propose a new Chinese reading comprehension dataset for accelerating related research in the community. The proposed dataset contains two diff...

Constructing Datasets for Multi-hop Reading Comprehension Across Documents

Most Reading Comprehension methods limit themselves to queries which can be answered using a single sentence, paragraph, or document. Enabling models to combine disjoint pieces of textual evidence would extend the scope of machine comprehension methods, but currently there exist no resources to train and test this capability. We propose a novel task to encourage the development of models for te...

Simple and Effective Multi-Paragraph Reading Comprehension

We consider the problem of adapting neural paragraph-level question answering models to the case where entire documents are given as input. Our proposed solution trains models to produce well-calibrated confidence scores for their results on individual paragraphs. We sample multiple paragraphs from the documents during training, and use a shared-normalization training objective that encourages t...

Start and End Interactions in Bidirectional Attention Flow for Reading Comprehension

The reading comprehension machine learning task involves reading in a question and returning an answer from an associated context paragraph. This task has proven to be difficult, as the performance of state-of-the-art models still does not compare with human performance. The difficulty of the task comes from understanding two separate pieces of information as well as the relationship between the...

Assignment 4: Reading Comprehension

Reading comprehension is the task of having a machine understand a piece of text. We train an end-to-end neural network that models the conditional distribution of start and end indices, given the question and context paragraph. We build on top of the baseline suggested in the Assignment, and explore new models to implement attention. We also measure the performance of the models and analyse the...

Journal title:
  • CoRR

Volume abs/1612.04342  Issue 

Pages  -

Publication date 2016